AnCoraPipe: A new tool for corpora annotation

Authors

  • Manuel Bertran
  • Oriol Borrega
  • M. Antònia Martí
  • Mariona Taulé
Abstract

This paper describes AnCoraPipe, an environment for the creation, editing and analysis of linguistic corpora and lexicons. AnCoraPipe has been used in the development of several linguistic resources: the AnCora, CesCa, ClInt and Amazighe corpora, and the verbal and nominal AnCora lexicons. We present the functionalities of AnCoraPipe, the way in which the data and metadata are structured, and some implementation details.

Similar resources

AnCoraPipe: A tool for multilevel annotation

AnCoraPipe is a corpus annotation tool which allows different linguistic levels to be annotated simultaneously and efficiently, since it uses a single format for all stages. In this way, the required annotation time is reduced and the integration of the work of all annotators is made easier.

A New Annotation Tool for Aligned Bilingual Corpora

This paper presents a new annotation tool for aligned bilingual corpora, which allows the annotation of a wide range of information, ranging from information about words (such as part-of-speech tags or named-entities) to quite complex annotation schemas involving links between aligned segments, such as co-reference or translation equivalence between aligned segments in the two languages. The an...

Coreference Annotator - A new annotation tool for aligned bilingual corpora

This paper presents the main features of an annotation tool, the Coreference Annotator, which manages bilingual corpora consisting of aligned texts that can be grouped in collections and subcollections according to their topics and discourse. The tool allows the manual annotation of certain linguistic items in the source text and their translation equivalent in the target text, by entering usef...

MMAX: A Tool for the Annotation of Multi-modal Corpora

We present a tool for the annotation of XML-encoded multi-modal language corpora. Non-hierarchical data is supported by means of standoff annotation. We define base level and suprabase level elements and theory-independent markables for multi-modal annotation and apply them to a cospecification annotation scheme. We also describe how arbitrary annotation schemes can be represented in terms of the...

SegProso: A Praat-Based Tool for the Automatic Detection and Annotation of Prosodic Boundaries in Speech Corpora

In this paper we describe SegProso, a Praat-based tool for the automatic segmentation in prosodic units of speech corpora. It is made up of a set of Praat scripts that add several tiers, each one containing the segmentation of a different unit, to a previously existing TextGrid file including the phonetic segmentation of the associated wav file. It has been successfully used for the annotation ...

Publication date: 2011